1 Introduction

Break a ruler of length $n$ inches, with indentations at each inch mark, into $n$ sticks of length
one by repeatedly picking up all the fragments that are still longer than one inch and
throwing them down all at once, until all the sticks are of length one. Assume that each stick
breaks into exactly two sticks, uniformly at the inch marks. What is the expected number of
throws needed to complete the $n - 1$ breaks?

We show that the answer to this question is $(4.31107\ldots) \cdot \ln n \cdot (1 + o(1))$ by, as
mathematicians are fond of doing, reducing it to an already solved problem. The problem that we
reduce it to (or rather show that it is almost trivially equivalent to) is the problem of finding
the asymptotic average height of a binary search tree. The latter problem was first solved
by Luc Devroye[D], with a later refinement by Bruce Reed[Re].

We also present a possibly new (and we hope elegant) proof that $(4.31107\ldots) \cdot \ln n \cdot (1 + o(1))$
is an upper bound, first proved in 1979 by Robson[Ro]. Our approach is to first derive a
simple formula for the generating function of the number of throws needed to isolate the $j$th
stick and then use Chernoff's inequality. The constant $\alpha = 4.31107\ldots$ is the root in
$[2, \infty)$ of $\alpha \ln(2e/\alpha) = 1$.

The worst case, if you are extremely unlucky, requires $n - 1$ throws. In this case, first the
stick breaks into a one-inch piece and an $(n-1)$-inch piece, then the next throw breaks the
$(n-1)$-inch stick into a one-inch piece and an $(n-2)$-inch piece, and so on.

The best case (if $n = 2^k$) is $\log_2 n$. First the stick breaks into two equal pieces, each of
length $n/2$ inches, then into four pieces each of length $n/4$ inches, etc., and in $k = \log_2 n$
throws into $n$ one-inch pieces.

But if you do it many times, and $n$ is large, what is the expected number of throws? As
we have already stated, the answer is $(4.31107\ldots) \cdot (\ln n) \cdot (1 + o(1))$, so the ratio of the
asymptotic average to the best case is only $(4.31107\ldots) \cdot \ln 2 \cdot (1 + o(1)) = 2.98821 + o(1)$.

Recall that a full binary tree is a "family tree" (but with single-parenthood), with a root
(Eve), and where every vertex either has two children (called the left-child and the
right-child) or no children at all. Childless vertices are called leaves. The height of a full
binary tree is the length of a longest path from the root to a leaf.



Each scenario of completely breaking an $n$-inch stick into $n$ one-inch pieces can be naturally
associated with a full binary tree with $n$ leaves. The first drop of the interval $[0, n]$ results
either in $[0, 1]$ and $[1, n]$, or $[0, 2]$ and $[2, n]$, ..., or $[0, n-1]$ and $[n-1, n]$. Now each of the
two pieces goes its own way, generating its own full binary subtree. Of course, the "number
of throws needed to completely break the stick" is the height of the corresponding full binary
tree.

To see all the 14 breaking scenarios for a five-inch ruler go to:
http://www.math.rutgers.edu/~zeilberg/shepp/S5.html

To see all the 42 breaking scenarios for a six-inch ruler go to:
http://www.math.rutgers.edu/~zeilberg/shepp/S6.html

Now to each internal vertex (i.e. a vertex that is not a leaf) assign the label "number of
leaves in the subtree whose root it is". It is obvious that the probability of the breaking
scenario corresponding to any given full binary tree is the product of $1/(i-1)$ over all the
labels $i$ of non-leaves, since a stick of length $i$ may be broken into two smaller pieces in
$i - 1$ equally likely ways.
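
The product rule for scenario probabilities can be checked by brute force for small rulers. The following Python sketch is not part of the original article; the function name `scenarios` is our own, and it only assumes the recursive description given above (the first throw breaks the ruler at a uniformly chosen inch mark, after which the two pieces evolve independently).

```python
from fractions import Fraction

def scenarios(n):
    """Yield (probability, height) for every full binary tree with n leaves,
    i.e. for every way of completely breaking an n-inch ruler."""
    if n == 1:
        yield Fraction(1), 0          # a one-inch stick needs no throws
        return
    for k in range(1, n):             # first break at inch mark k, probability 1/(n-1)
        for p_left, h_left in scenarios(k):
            for p_right, h_right in scenarios(n - k):
                yield p_left * p_right / (n - 1), 1 + max(h_left, h_right)

trees = list(scenarios(5))
print("number of scenarios:", len(trees))
print("total probability:", sum(p for p, _ in trees))
print("expected number of throws:", sum(p * h for p, h in trees))
```

The enumeration grows like the Catalan numbers, so it is only meant for small $n$; it is useful mainly as a check that the probabilities of all scenarios add up to one.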

So the expectation of the random variable "number of throws needed to completely break an
$n$-inch ruler" is nothing but the expected height of a full binary tree with $n$ leaves, under
the above probability distribution. Let's call it $a(n)$.

Recall (or go to Wikipedia) that any list of $n$ different numbers gives rise to a binary search tree
with $n$ vertices, where the first record, let's call it $i_1$, is put at the root, and its left
(respectively right) subtree is, recursively, the binary search tree of the sublist (in the same
order) consisting of all entries smaller (respectively larger) than $i_1$. If a list is empty, then
it corresponds to the empty tree with no vertices.

For example, the binary search tree corresponding to the permutation [4, 3, 9, 1, 5, 6, 2, 11, 7, 8, 10, 12] 
can be seen here: 

http://www.math.rutgers.edu/~zeilberg/shepp/BS12.html

It is obvious that a binary search tree corresponds to a (not necessarily full) binary tree
with $n$ vertices, where every vertex may have no children, only a left-child, only a
right-child, or both a left-child and a right-child.

To see all the 14 binary search trees for the four records {1, 2, 3, 4} go to:
http://www.math.rutgers.edu/~zeilberg/shepp/BS4.html

To see all the 42 binary search trees for the five records {1, 2, 3, 4, 5} go to:
http://www.math.rutgers.edu/~zeilberg/shepp/BS5.html

If you pick a permutation of $\{1, \ldots, n\}$ uniformly at random, it is clear that the probability
of it having a certain binary search tree is simply the product of $1/i(v)$ over all vertices $v$
of the tree, where $i(v)$ is defined to be "the number of vertices in the subtree rooted at $v$".
For example, the probability of the tree shown in:
http://www.math.rutgers.edu/~zeilberg/shepp/BS12.html
is

(1/12)(1/3)(1/2)(1/1)(1/8)(1/4)(1/3)(1/2)(1/1)(1/3)(1/1)(1/1) 



This is the probability assigned to a random binary search tree, and the average is defined
according to that probability distribution.
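
The worked example can be reproduced mechanically. The sketch below is our own illustration, not code from the article or from its Maple packages: it builds the binary search tree of the permutation [4, 3, 9, 1, 5, 6, 2, 11, 7, 8, 10, 12] by successive insertion and multiplies $1/i(v)$ over all vertices. The nested-list representation and the function names are assumptions made for the example.

```python
from fractions import Fraction

def insert(tree, key):
    """Insert key into a BST stored as [key, left, right]; None is the empty tree."""
    if tree is None:
        return [key, None, None]
    if key < tree[0]:
        tree[1] = insert(tree[1], key)
    else:
        tree[2] = insert(tree[2], key)
    return tree

def build_bst(records):
    """Binary search tree of a list of distinct records, inserted in the given order."""
    tree = None
    for key in records:
        tree = insert(tree, key)
    return tree

def size(tree):
    """i(v): number of vertices in the subtree rooted at v."""
    return 0 if tree is None else 1 + size(tree[1]) + size(tree[2])

def height(tree):
    """Number of edges on a longest root-to-leaf path; the empty tree has height -1."""
    return -1 if tree is None else 1 + max(height(tree[1]), height(tree[2]))

def probability(tree):
    """Product of 1/i(v) over all vertices v of the tree."""
    if tree is None:
        return Fraction(1)
    return Fraction(1, size(tree)) * probability(tree[1]) * probability(tree[2])

tree = build_bst([4, 3, 9, 1, 5, 6, 2, 11, 7, 8, 10, 12])
print("probability:", probability(tree))
print("height:", height(tree))
```

Recomputing `size` at every vertex is quadratic in the worst case; that is acceptable for a twelve-vertex illustration, but a single bottom-up pass would be preferable for large trees.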

In a famous paper in theoretical computer science, Bruce Reed[Re] improved on a seminal
paper of Luc Devroye[D] (who only had the highest-order asymptotics) by proving that if $b(n)$
is the average height of a binary tree with $n$ vertices under the above probability distribution,
then

$$b(n) = \alpha \ln n - \beta \ln\ln n + O(1),$$

where $\alpha = 4.31107\ldots$ is the unique root in $[2, \infty)$ of the equation

$$\alpha \ln((2e)/\alpha) = 1,$$

and $\beta = 1.953\ldots$.
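
The two constants are easy to recover numerically. The following sketch is ours and only assumes the defining equation above together with the closed form $\beta = (3/2)/\ln(\alpha/2)$ quoted in Section 2.2; the bisection bracket $[2, 2e]$ is our choice, justified in the comments.

```python
import math

def g(a):
    """g(a) = a*ln(2e/a) - 1, so that alpha is the root of g in [2, infinity)."""
    return a * math.log(2 * math.e / a) - 1

def solve_alpha(lo=2.0, hi=2 * math.e, iterations=100):
    """Bisection on [2, 2e]: g(2) = 1 > 0, g(2e) = -1 < 0, and g'(a) = ln(2/a) < 0 for a > 2."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

alpha = solve_alpha()
beta = 1.5 / math.log(alpha / 2)   # beta = (3/2) / ln(alpha/2)
print(alpha, beta)
```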

We are almost done! It is trivial to see (e.g. by induction) that a full binary tree with $n$
leaves has $n - 1$ internal vertices (i.e. non-leaves). There is a well-known, obvious bijection
between binary trees with $n - 1$ vertices and full binary trees with $n$ leaves. If a vertex has
both children, leave it alone. If it only has a left-child, create for it a right-child that is a
leaf. If it only has a right-child, create for it a left-child that is a leaf. If it has no children
(i.e. is a leaf in the original tree), create for it both a left-child and a right-child, both
leaves. To go back is even easier! Remove all the leaves!

By removing all the leaves, the height gets reduced by one (and all the labels as well, but
the probability stays the same, due to the different definitions in both cases), so we have the
following relation between the stick problem and the well-known problem of the average height
of a random binary search tree:

$$a(n) = b(n-1) + 1.$$

Of course, the asymptotics of $a(n)$ are the same as those of $b(n)$.
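
The relation between $a(n)$ and $b(n-1)$ can be confirmed with exact rational arithmetic for small $n$. The sketch below is our own and is not taken from the accompanying Maple packages; it assumes only the two random models described above, and computes each expectation from the distribution function of the height. The names `stick_cdf`, `bst_cdf`, `a` and `b` are ours.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def stick_cdf(n, h):
    """P(an n-inch stick is completely broken within h throws)."""
    if n == 1:
        return Fraction(1)
    if h <= 0:
        return Fraction(0)
    # First break at inch mark k (uniform); both pieces must finish in h-1 more throws.
    return sum(stick_cdf(k, h - 1) * stick_cdf(n - k, h - 1) for k in range(1, n)) / (n - 1)

@lru_cache(maxsize=None)
def bst_cdf(n, h):
    """P(a random binary search tree on n vertices has height <= h); the empty tree has height -1."""
    if n == 0:
        return Fraction(1) if h >= -1 else Fraction(0)
    if h < 0:
        return Fraction(0)
    # The root is the k-th smallest record with probability 1/n.
    return sum(bst_cdf(k - 1, h - 1) * bst_cdf(n - k, h - 1) for k in range(1, n + 1)) / n

def a(n):
    """Expected number of throws for an n-inch stick: sum over h of P(height > h)."""
    return sum(1 - stick_cdf(n, h) for h in range(n - 1))

def b(n):
    """Expected height of a random binary search tree on n vertices."""
    return sum(1 - bst_cdf(n, h) for h in range(n - 1))

for n in range(2, 9):
    print(n, a(n), b(n - 1) + 1)
```

Because both recursions are memoised, the check runs in polynomial time, unlike a full enumeration of trees.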

2 A Quick Proof of the Upper Bound 

We now give a quick argument that reproves the inequality $a(n) \le \alpha \ln n \cdot (1 + o(1))$, first
proved by Robson[Ro], thereby proving half of Devroye's result.

We hope, in the future, to extend our approach to proving the other half, namely
$a(n) \ge \alpha \ln n \cdot (1 + o(1))$, thereby yielding a hopefully simpler proof, or at least an
alternative one, of Devroye's result.

2.1 The generating function for the time to isolate the jth stick 

Let $T_{j,n}$, $j = 1, \ldots, n$, denote the number of throws needed until the $j$th one-inch stick
is isolated, and let $f_{j,n}(p)$ denote the generating function of $T_{j,n}$,

$$f_{j,n}(p) = E\left[p^{T_{j,n}}\right].$$



It is clear that $f_{1,1}(p) = 1$, since $T_{1,1} = 0$, and that the recurrence

$$f_{1,n}(p) = \frac{p}{n-1} \sum_{k=1}^{n-1} f_{1,k}(p)$$

holds for the generating function of the number of throws needed to isolate the first stick.
An easy induction shows that

$$f_{1,n}(p) = \prod_{i=1}^{n-1} \left(1 + \frac{p-1}{i}\right).$$



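The number of throws needed to isolate the $j$th stick is the sum of the number of throws that
break the fragment containing it on its left (the last of these is the break at inch mark $j - 1$)
and the number that break it on its right (the last of these is the break at inch mark $j$).
These two numbers are independent and are distributed as $T_{1,j}$ and $T_{1,n-j+1}$ respectively,
so we see that $f_{j,n}(p) = f_{1,j}(p) f_{1,n-j+1}(p)$, which gives the general result that

$$f_{j,n}(p) = \prod_{i=1}^{j-1} \left(1 + \frac{p-1}{i}\right) \prod_{i=1}^{n-j} \left(1 + \frac{p-1}{i}\right).$$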
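
The product form $f_{j,n}(p) = \prod_{i=1}^{j-1}(1 + (p-1)/i) \prod_{i=1}^{n-j}(1 + (p-1)/i)$ can be tested against a direct computation from the breaking process. The Python sketch below is our own check, under the assumption that the first break of an $n$-inch fragment is uniform over its $n - 1$ inch marks; the names `gf` and `product_formula` are hypothetical, and the generating function is evaluated at a rational point rather than handled symbolically.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def gf(j, n, p):
    """E[p**T_{j,n}], computed directly by conditioning on the first break."""
    if n == 1:
        return Fraction(1)                 # stick already isolated, T = 0
    total = Fraction(0)
    for k in range(1, n):                  # first break at inch mark k
        if j <= k:
            total += gf(j, k, p)           # stick j lies in the left piece [0, k]
        else:
            total += gf(j - k, n - k, p)   # stick j lies in the right piece [k, n]
    return p * total / (n - 1)             # one throw used, each k has probability 1/(n-1)

def product_formula(j, n, p):
    """The closed form: a product over i < j times a product over i <= n - j."""
    result = Fraction(1)
    for i in range(1, j):
        result *= 1 + (p - 1) / i
    for i in range(1, n - j + 1):
        result *= 1 + (p - 1) / i
    return result

p = Fraction(5, 2)
print(all(gf(j, n, p) == product_formula(j, n, p)
          for n in range(1, 11) for j in range(1, n + 1)))
```

Agreement at a few rational points is of course not a proof, but it does catch index slips such as an off-by-one in the upper limits of the two products.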



2.2 Expectation 

We now prove that the expected total number of throws needed to completely break the 
stick is no greater than 

$$\alpha \ln(n + 1) - 1.$$

Since $\max_{j \le n} T_{j,n}$ is the number of throws needed, this assertion is equivalent to

$$E \max_{j \le n} T_{j,n} \le \alpha \ln(n + 1) - 1.$$



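The explicit formula of the generating function allows us to apply the Chernoff bound

$$P(T_{j,n} \ge x) \le p^{-x} f_{j,n}(p)$$

to each $T_{j,n}$. Let $m = (n + 1)/2$. For $p \ge 2$, $(1 + (p-1)/k) \le (1 + 1/k)^{p-1}$, so that

$$f_{j,n}(p) = \prod_{i=1}^{j-1} \Big(1 + \frac{p-1}{i}\Big) \prod_{i=1}^{n-j} \Big(1 + \frac{p-1}{i}\Big)
\le \prod_{i=1}^{j-1} \Big(1 + \frac{1}{i}\Big)^{p-1} \prod_{i=1}^{n-j} \Big(1 + \frac{1}{i}\Big)^{p-1}
= \{j(n - j + 1)\}^{p-1} \le m^{2p-2}.$$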



Let $x \ge \alpha \ln m$, $y = x/\ln m$ and $p = y/2$. Since $p \ge \alpha/2 > 2$,

$$P\Big(\max_{j \le n} T_{j,n} \ge x\Big) \le \sum_{j=1}^{n} p^{-x} f_{j,n}(p)
\le 2 \exp\big(\ln m - y \ln m \ln p + (2p - 2) \ln m\big)
= 2 \exp\big((-1 + y \ln(2e/y)) \ln m\big).$$

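Since $(d/dt)\{-1 + t \ln(2e/t)\} = \ln(2/t) \le \ln(2/\alpha)$ for $\alpha \le t < \infty$, and
$-1 + \alpha \ln(2e/\alpha) = 0$,

$$(-1 + y \ln(2e/y)) \ln m \le (y - \alpha) \ln(2/\alpha) \ln m = (x - \alpha \ln m) \ln(2/\alpha).$$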

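It follows that $P(\max_{j \le n} T_{j,n} \ge x) \le 2 (2/\alpha)^{x - \alpha \ln m}$. Let $u$ be the fractional
part of $\alpha \ln m - \ln 2/\ln(2/\alpha)$. We find

$$\sum_{x=1}^{n-1} P\Big(\max_{j \le n} T_{j,n} \ge x\Big) \le \sum_{x=1}^{n-1} \min\big\{1, 2 (2/\alpha)^{x - \alpha \ln m}\big\}
\le \alpha \ln m - \ln 2/\ln(2/\alpha) - u + \sum_{k=1}^{\infty} (2/\alpha)^{k-u}.$$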

Since the function $-u + \sum_{k=1}^{\infty} (2/\alpha)^{k-u}$ is convex in $u \in [0, 1]$ and takes the same
value at $u = 1$ as at $u = 0$, its maximum, $1/(1 - 2/\alpha) - 1 = 2/(\alpha - 2)$, is attained at $u = 1$.
Consequently, the expectation of $\max_{j \le n} T_{j,n}$ is no greater than

$$\alpha \ln(n + 1) - \alpha \ln 2 - \ln 2/\ln(2/\alpha) + 2/(\alpha - 2).$$

The conclusion follows from $-\alpha \ln 2 - \ln 2/\ln(2/\alpha) + 2/(\alpha - 2) < -1$.
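
The final numerical inequality is a one-line computation. The sketch below is ours; it plugs the quoted value $\alpha = 4.31107$ into the constant $-\alpha \ln 2 - \ln 2/\ln(2/\alpha) + 2/(\alpha - 2)$ and prints the three terms separately so that the signs are easy to follow (note that $\ln(2/\alpha)$ is negative).

```python
import math

ALPHA = 4.31107  # value quoted in the text for the root of alpha*ln(2e/alpha) = 1

def bound_constant(alpha=ALPHA):
    """Return the three terms of the constant and their sum."""
    term1 = -alpha * math.log(2)                  # from alpha*ln(m) = alpha*ln(n+1) - alpha*ln(2)
    term2 = -math.log(2) / math.log(2 / alpha)    # positive, because ln(2/alpha) < 0
    term3 = 2 / (alpha - 2)                       # maximum of the convex function of u
    return term1, term2, term3, term1 + term2 + term3

print(bound_constant())
```

The sum comes out near $-1.22$, so the bound $\alpha \ln(n + 1) - 1$ holds with some room to spare.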

A slightly different simple argument using the Poisson dominance of Bernoulli yields an
upper bound with the $\ln\ln n$ term:

$$E \max_{j \le n} T_{j,n} \le \alpha \ln n - (\beta/3) \ln\ln n + O(1)$$

with $\beta = (3/2)/\ln(\alpha/2) = 1.953026$. However, the coefficient of the second-order term,
$\beta/3$, is not sharp according to Reed's refinement of Devroye's result, which states that

$$E \max_{j \le n} T_{j,n} = \alpha \ln n - \beta \ln\ln n + O(1).$$

Remark. The individual times, $T_j$, are rather smaller than the maximum of the $T$'s. Indeed,
using the generating function of $T_j$ it is not hard to show that as $n \to \infty$,

$$T_j = 2 \ln n + \sqrt{2 \ln n}\, \eta_j + o(\sqrt{\ln n})$$

where $\eta_j$ is a standard normal random variable. Thus, it is only the relative independence of
the $T$'s that allows the maximum to be as large as in Reed's theorem, $\approx 4.31\ldots \ln n$.
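
The leading terms in the Remark can be illustrated numerically. Each factor $1 + (p-1)/i$ of the generating function in Section 2.1 is the generating function of a Bernoulli variable with success probability $1/i$, so, assuming that product form, $T_{j,n}$ is distributed as a sum of independent Bernoulli variables. The sketch below (ours; the function name is hypothetical) computes the resulting mean and variance for a stick in the middle of the ruler and compares both with $2 \ln n$.

```python
import math

def mean_and_variance(j, n):
    """Mean and variance of T_{j,n}, viewed as a sum of independent Bernoulli(1/i)
    variables with i = 1..j-1 and i = 1..n-j."""
    probs = [1 / i for i in range(1, j)] + [1 / i for i in range(1, n - j + 1)]
    mean = sum(probs)
    variance = sum(q * (1 - q) for q in probs)
    return mean, variance

n = 10**5
mean, variance = mean_and_variance(n // 2, n)
print(mean / (2 * math.log(n)), variance / (2 * math.log(n)))
```

Both ratios tend to 1 only slowly, since the corrections to $2 \ln n$ are of constant order; for moderate $n$ the variance ratio in particular is still visibly below 1.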

Acknowledgement: The stick problem was inspired by the interesting paper [IK]. 

3 Maple Packages 

This article is accompanied by two Maple packages for experimenting with, and simulating,
both binary trees under the uniform distribution (not discussed in this article) and the stick
problem (alias binary search trees). The most important one is BinaryTrees and the lesser
one is ArbresBinaires. There is yet another Maple package, EtzBinary, mainly for plotting,
and that's how we drew the pictures linked to above. All three packages, as well as
some sample input and output files, can be obtained from the "front" of this article:

http://www.math.rutgers.edu/~zeilberg/mamarim/mamarimhtml/stick.html .